Goto

Collaborating Authors

 Hatay Province






A Deep Learning framework for building damage assessment using VHR SAR and geospatial data: demonstration on the 2023 Turkiye Earthquake

Russo, Luigi, Tapete, Deodato, Ullo, Silvia Liberata, Gamba, Paolo

arXiv.org Artificial Intelligence

Building damage identification shortly after a disaster is crucial for guiding emergency response and recovery efforts. Although optical satellite imagery is commonly used for disaster mapping, its effectiveness is often hampered by cloud cover or the absence of pre-event acquisitions. To overcome these challenges, we introduce a novel multimodal deep learning (DL) framework for detecting building damage using single-date very high resolution (VHR) Synthetic Aperture Radar (SAR) imagery from the Italian Space Agency (ASI) COSMO SkyMed (CSK) constellation, complemented by auxiliary geospatial data. Our method integrates SAR image patches, OpenStreetMap (OSM) building footprints, digital surface model (DSM) data, and structural and exposure attributes from the Global Earthquake Model (GEM) to improve detection accuracy and contextual interpretation. Unlike existing approaches that depend on pre and post event imagery, our model utilizes only post event data, facilitating rapid deployment in critical scenarios. The framework effectiveness is demonstrated using a new dataset from the 2023 earthquake in Turkey, covering multiple cities with diverse urban settings. Results highlight that incorporating geospatial features significantly enhances detection performance and generalizability to previously unseen areas. By combining SAR imagery with detailed vulnerability and exposure information, our approach provides reliable and rapid building damage assessments without the dependency from available pre-event data. Moreover, the automated and scalable data generation process ensures the framework's applicability across diverse disaster-affected regions, underscoring its potential to support effective disaster management and recovery efforts. Code and data will be made available upon acceptance of the paper.


Shifting the Paradigm: A Diffeomorphism Between Time Series Data Manifolds for Achieving Shift-Invariancy in Deep Learning

Demirel, Berken Utku, Holz, Christian

arXiv.org Artificial Intelligence

Deep learning models lack shift invariance, making them sensitive to input shifts that cause changes in output. While recent techniques seek to address this for images, our findings show that these approaches fail to provide shift-invariance in time series, where the data generation mechanism is more challenging due to the interaction of low and high frequencies. Worse, they also decrease performance across several tasks. In this paper, we propose a novel differentiable bijective function that maps samples from their high-dimensional data manifold to another manifold of the same dimension, without any dimensional reduction. Our approach guarantees that samples -- when subjected to random shifts -- are mapped to a unique point in the manifold while preserving all task-relevant information without loss. We theoretically and empirically demonstrate that the proposed transformation guarantees shift-invariance in deep learning models without imposing any limits to the shift. Our experiments on six time series tasks with state-of-the-art methods show that our approach consistently improves the performance while enabling models to achieve complete shift-invariance without modifying or imposing restrictions on the model's topology. The source code is available on \href{https://github.com/eth-siplab/Shifting-the-Paradigm}{GitHub}.


BRIGHT: A globally distributed multimodal building damage assessment dataset with very-high-resolution for all-weather disaster response

Chen, Hongruixuan, Song, Jian, Dietrich, Olivier, Broni-Bediako, Clifford, Xuan, Weihao, Wang, Junjue, Shao, Xinlei, Wei, Yimin, Xia, Junshi, Lan, Cuiling, Schindler, Konrad, Yokoya, Naoto

arXiv.org Artificial Intelligence

Disaster events occur around the world and cause significant damage to human life and property. Earth observation (EO) data enables rapid and comprehensive building damage assessment (BDA), an essential capability in the aftermath of a disaster to reduce human casualties and to inform disaster relief efforts. Recent research focuses on the development of AI models to achieve accurate mapping of unseen disaster events, mostly using optical EO data. However, solutions based on optical data are limited to clear skies and daylight hours, preventing a prompt response to disasters. Integrating multimodal (MM) EO data, particularly the combination of optical and SAR imagery, makes it possible to provide all-weather, day-and-night disaster responses. Despite this potential, the development of robust multimodal AI models has been constrained by the lack of suitable benchmark datasets. In this paper, we present a BDA dataset using veRy-hIGH-resoluTion optical and SAR imagery (BRIGHT) to support AI-based all-weather disaster response. To the best of our knowledge, BRIGHT is the first open-access, globally distributed, event-diverse MM dataset specifically curated to support AI-based disaster response. It covers five types of natural disasters and two types of man-made disasters across 12 regions worldwide, with a particular focus on developing countries where external assistance is most needed. The optical and SAR imagery in BRIGHT, with a spatial resolution between 0.3-1 meters, provides detailed representations of individual buildings, making it ideal for precise BDA. In our experiments, we have tested seven advanced AI models trained with our BRIGHT to validate the transferability and robustness. The dataset and code are available at https://github.com/ChenHongruixuan/BRIGHT. BRIGHT also serves as the official dataset for the 2025 IEEE GRSS Data Fusion Contest.


Towards Open Respiratory Acoustic Foundation Models: Pretraining and Benchmarking

Zhang, Yuwei, Xia, Tong, Han, Jing, Wu, Yu, Rizos, Georgios, Liu, Yang, Mosuily, Mohammed, Chauhan, Jagmohan, Mascolo, Cecilia

arXiv.org Artificial Intelligence

Respiratory audio, such as coughing and breathing sounds, has predictive power for a wide range of healthcare applications, yet is currently under-explored. The main problem for those applications arises from the difficulty in collecting large labeled task-specific data for model development. Generalizable respiratory acoustic foundation models pretrained with unlabeled data would offer appealing advantages and possibly unlock this impasse. However, given the safety-critical nature of healthcare applications, it is pivotal to also ensure openness and replicability for any proposed foundation model solution. To this end, we introduce OPERA, an OPEn Respiratory Acoustic foundation model pretraining and benchmarking system, as the first approach answering this need. We curate large-scale respiratory audio datasets ( 136K samples, 440 hours), pretrain three pioneering foundation models, and build a benchmark consisting of 19 downstream respiratory health tasks for evaluation. Our pretrained models demonstrate superior performance (against existing acoustic models pretrained with general audio on 16 out of 19 tasks) and generalizability (to unseen datasets and new respiratory audio modalities). This highlights the great promise of respiratory acoustic foundation models and encourages more studies using OPERA as an open resource to accelerate research on respiratory audio for health. The system is accessible from https://github.


Class Attendance System in Education with Deep Learning Method

Demir, Hüdaverdi, Savaş, Serkan

arXiv.org Artificial Intelligence

With the advancing technology, the hardware gain of computers and the increase in the processing capacity of processors have facilitated the processing of instantaneous and real-time images. Face recognition processes are also studies in the field of image processing. Facial recognition processes are frequently used in security applications and commercial applications. Especially in the last 20 years, the high performances of artificial intelligence (AI) studies have contributed to the spread of these studies in many different fields. Education is one of them. The potential and advantages of using AI in education; can be grouped under three headings: student, teacher, and institution. One of the institutional studies may be the security of educational environments and the contribution of automation to education and training processes. From this point of view, deep learning methods, one of the sub-branches of AI, were used in this study. For object detection from images, a pioneering study has been designed and successfully implemented to keep records of students' entrance to the educational institution and to perform class attendance with images taken from the camera using image processing algorithms. The application of the study to real-life problems will be carried out in a school determined in the 2022-2023 academic year.


Aligning Language Models with Preferences through f-divergence Minimization

Go, Dongyoung, Korbak, Tomasz, Kruszewski, Germán, Rozen, Jos, Ryu, Nahyeon, Dymetman, Marc

arXiv.org Artificial Intelligence

Aligning language models with preferences can be posed as approximating a target distribution representing some desired behavior. Existing approaches differ both in the functional form of the target distribution and the algorithm used to approximate it. For instance, Reinforcement Learning from Human Feedback (RLHF) corresponds to minimizing a reverse KL from an implicit target distribution arising from a KL penalty in the objective. On the other hand, Generative Distributional Control (GDC) has an explicit target distribution and minimizes a forward KL from it using the Distributional Policy Gradient (DPG) algorithm. In this paper, we propose a new approach, f-DPG, which allows the use of any f-divergence to approximate any target distribution that can be evaluated. f-DPG unifies both frameworks (RLHF, GDC) and the approximation methods (DPG, RL with KL penalties). We show the practical benefits of various choices of divergence objectives and demonstrate that there is no universally optimal objective but that different divergences present different alignment and diversity trade-offs. We show that Jensen-Shannon divergence strikes a good balance between these objectives, and frequently outperforms forward KL divergence by a wide margin, leading to significant improvements over prior work. These distinguishing characteristics between divergences persist as the model size increases, highlighting the importance of selecting appropriate divergence objectives.